Comprehensive comparisons of ocular biometry: A network-based big data analysis

Purpose To systematically compare and rank ocular measurements with optical and ultrasound biometers based on big data.
Methods PubMed, Embase, the Cochrane Library and the US trial registry (www.ClinicalTrial.gov) were used to systematically search trials published up to October 22nd, 2020. We included comparative studies reporting the following parameters measured by at least two devices: axial length (AL), flattest meridian keratometry (Kf), steepest meridian keratometry (Ks), mean keratometry (Km), astigmatism (AST), astigmatism vectors J0 and J45, anterior chamber depth (ACD), aqueous depth (AQD), central corneal thickness (CCT), corneal diameter (CD) and lens thickness (LT). A network-based big data analysis was conducted using STATA version 13.1.
Results Across 129 studies involving 17,181 eyes, 12 optical biometers and two ultrasound biometers (with both contact and immersion techniques) were identified. A network meta-analysis for AL and ACD measurements found that statistically significant differences existed when contact ultrasound biometry was compared with the optical biometers. There were no statistically significant differences among the four swept-source optical coherence tomography (SS-OCT) based devices (IOLMaster 700, OA-2000, Argos and ANTERION). As for Ks, Km and CD, statistically significant differences were found when the Pentacam AXL was compared with the IOLMaster and IOLMaster 500. There were statistically significant differences for CCT when the OA-2000 was compared to Pentacam AXL, IOLMaster 700, Lenstar, AL-Scan and Galilei G6.
Conclusion For AL and ACD, contact ultrasound biometry obtains lower values compared with optical biometers. The Pentacam AXL achieves the lowest values for keratometry and CD. The smallest value for CCT measurement is found with the OA-2000.
Supplementary Information The online version contains supplementary material available at 10.1186/s40662-022-00320-3.

calculating intraocular lens (IOL) power in cataract surgery [2] and monitoring the progression of myopia. ACD and AQD can be used to assess angle closure glaucoma, monitor anterior segment changes during accommodation and select anterior chamber phakic IOLs. Keratometry is used to calculate the IOL power and for other purposes (e.g., the diagnosis and grading of keratoconus or contact lens fitting). CCT is utilized when considering patients for refractive surgery [3] to reduce the risk of postoperative ectasia. In order to select the most appropriately sized IOL to be placed in the anterior chamber, an accurate measurement of the CD is necessary [4,5]. LT influences the depth of the anterior chamber and can explain the cause and mechanism of glaucoma. It also influences the effective position of the IOL and can be a research topic exploring the pathogenesis and treatment of presbyopia [6].
Many clinical studies have compared these instruments to verify the agreement of their measurements [10][11][12][13][14][15][16][17][18]. However, no single comparative study has evaluated all instruments at the same time, so no definite conclusion can be drawn about how they compare overall. The purpose of this network-based big data analysis is to systematically review the existing evidence, to compare the measurement differences among all optical and ultrasound biometers, and to guide clinical decisions.

Methods
This systematic review complies with the preferred reporting items for systematic reviews and meta-analyses (PRISMA) network meta-analysis extension statement [19].

Search methods
A systematic literature review was conducted using PubMed, Embase, the Cochrane Library and the US trial registry (www.ClinicalTrial.gov) for trials published up to October 22nd, 2020. The full search strategies are shown in Additional file 1: Appendix I. We also manually examined the reference lists of clinical trials, related meta-analyses and systematic reviews to identify relevant studies.

Outcome measurements
The following parameters were assessed in this review: AL (mm), Kf (D), Ks (D), Km (D), AST (D), J0 (D), J45 (D), ACD (mm), AQD (mm), CCT (μm), CD (mm), and LT (mm). Original parameters were obtained from the articles as far as possible, and parameters that were not reported were calculated from the available data where possible.
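As an illustration of how unreported parameters can be derived from reported ones, the following minimal Python sketch computes Km and AST from Kf and Ks, and the vectors J0 and J45 from a cylinder and axis. The function names are hypothetical, and the power-vector sign convention (the commonly used Thibos form with cylinder power C and axis α) is an assumption; the article does not state which formulas or conventions were applied.

```python
import math

def mean_k_and_astigmatism(kf, ks):
    """Return (Km, AST) in dioptres from flattest (Kf) and steepest (Ks) keratometry."""
    km = (kf + ks) / 2.0   # mean keratometry
    ast = ks - kf          # magnitude of keratometric astigmatism
    return km, ast

def power_vectors(cylinder, axis_deg):
    """Return (J0, J45) in dioptres.

    Assumed convention (not taken from the article):
    J0 = -(C / 2) * cos(2 * axis), J45 = -(C / 2) * sin(2 * axis).
    """
    alpha = math.radians(axis_deg)
    j0 = -(cylinder / 2.0) * math.cos(2.0 * alpha)
    j45 = -(cylinder / 2.0) * math.sin(2.0 * alpha)
    return j0, j45

# Illustrative values only, not data from the included studies.
km, ast = mean_k_and_astigmatism(43.0, 44.0)
j0, j45 = power_vectors(-1.0, 180.0)
```

Because sign and axis conventions differ between devices and studies, any such conversion would need to be checked against the convention used in each source article.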

Study selection and data extraction
Screening was performed by two independent investigators (YW, TW). They retrieved full-text articles that appeared relevant after reviewing the titles and abstracts. They independently assessed full-text articles for final eligibility. Any discrepancies were resolved by focused discussion or consultation with an additional investigator (JY). Two investigators (YW, TW) independently extracted information into an electronic database, including the author, the publication time, outcomes, and quantitative results for treatment effects. For data that were missing or could not be directly obtained, we contacted the authors of the trial reports or used GetData GraphDigitizer 2.24 (http://getdata-graph-digitizer.com) to obtain data from figures.

Risk of bias assessment
To evaluate the study quality, we used the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool for diagnostic studies, which has been strictly evaluated, verified, and recommended by the Cochrane Library. In this method, each of 14 items is rated "Yes", "No" or "Uncertain". In 2008, according to the opinions of the screening and diagnostic research methodology group of the Cochrane Library, items 3, 8 and 9 of QUADAS were classified as unnecessary. Therefore, the remaining 11 items were used to assess study quality [20].

Statistical analysis
STATA statistical software (version 13.0, Stata Corporation, College Station, TX, USA) was used to perform statistical analyses. For binary outcomes, relative effect sizes were calculated as odds ratios (OR) with 95% confidence intervals (CI). For continuous outcomes, relative effect sizes were calculated as weighted mean differences (WMD) with 95% CI. We inspected the I² statistic [21] (a value of 50% or more indicated substantial heterogeneity) to investigate the possibility of statistical heterogeneity. To incorporate indirect comparisons, we performed network meta-analyses using the mvmeta command in STATA version 13.1 [22] to estimate pooled ORs and WMD with 95% credible intervals (CrI). We ranked instruments based on the analysis of ranking probabilities and the surface under the cumulative ranking curve (SUCRA) [23]. The SUCRA values, expressed as percentages, indicate the relative probability that an instrument yields the maximum value of a given parameter. Inconsistency between direct and indirect evidence was assessed by a "node-splitting" approach and the design-by-treatment interaction model assuming consistency throughout the entire network [24]. In order to explore the potential sources of heterogeneity and inconsistency, we performed subgroup analyses comparing two population groups: healthy vs. diseased (cataract). To avoid the potential influences of age and AL on the measurement results, we limited the subgroups to adults and the normal AL range (22 to 26 mm). A funnel plot was used to evaluate publication bias in the results between small and large studies [25]. We also performed an additional comparison between groups defined by the principle of measurement.
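To make the pairwise quantities above concrete, the following minimal Python sketch pools study-level mean differences by inverse-variance weighting and reports Cochran's Q and I². It is an illustration only, not the authors' STATA code: the function name is hypothetical, and the random-effects step uses the DerSimonian-Laird estimator, which is an assumption because the article does not say which estimator was applied.

```python
import math

def pool_mean_differences(diffs, ses):
    """Pool mean differences from several studies.

    diffs: per-study mean differences (device A minus device B)
    ses:   per-study standard errors of those differences
    Returns a dict with the pooled WMD, its 95% CI, Q, I2 (%) and tau2.
    """
    w = [1.0 / se ** 2 for se in ses]                       # inverse-variance weights
    fixed = sum(wi * d for wi, d in zip(w, diffs)) / sum(w)
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, diffs))  # Cochran's Q
    df = len(diffs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1.0 / (se ** 2 + tau2) for se in ses]           # random-effects weights
    pooled = sum(wi * d for wi, d in zip(w_re, diffs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"wmd": pooled, "ci": ci, "q": q, "i2": i2, "tau2": tau2}

# Made-up numbers for illustration; they are not taken from the included studies.
result = pool_mean_differences([-0.12, -0.18, -0.15], [0.04, 0.05, 0.03])
```

Under the threshold stated above, an `i2` of 50 or more would be read as substantial heterogeneity for that comparison.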

Literature selection results
This initial literature search yielded 4854 papers. After duplicates were excluded, 3322 studies remained. Of these, 127 studies matched the inclusion criteria, and 7 additional single papers were added from other reference sources listed above. Five of the 134 papers were excluded as they were reviews or letters rather than comparative studies, or they did not include any primary or secondary outcome data. Ultimately, 129 studies met our criteria and were included in our network meta-analysis ( Fig. 1).

Study characteristics and network geometry
A summary of all eligible studies published until 2020 is shown in Additional file 1: Appendix I Table S1. A total of 17,181 eyes were measured by 12 optical biometers and two ultrasound biometers (with both contact and immersion techniques), with a total of five different techniques (Fig. 2). Almost all trials involved only two devices (92.2%). Among the included 129 trials, 43 (33.3%) recruited healthy or ametropic subjects, 85 (65.9%) recruited participants with cataract, 3 (2.3%) recruited participants who underwent cataract surgery, 1 (0.78%) recruited participants with glaucoma, 1 (0.78%) recruited participants with keratoconus, 1 (0.78%) recruited participants with silicone-filled eyes, and 5 (3.9%) recruited mixed participants.

Risk of bias assessment results
The risk of bias of the trials included in our study is shown in Additional file 1: Appendix I Table S2. Some trials were rated "No" or "Not clear" for items 1-5, but all trials were rated "Yes" for items 6-11. In general, all trials were regarded as high-quality.

Results of meta-analysis
Direct comparisons
Figures 3, 4, 5 and 6 (upper right) and Additional file 1: Appendix I Tables S3-S14 show the direct comparisons between each pair of instruments. In total, 112 studies involving 14 instruments were available for the comparison of the AL. Direct comparisons found that contact ultrasound measured shorter AL when compared with the IOLMaster (WMD = −0.159 mm). With regard to measurements of Kf, Ks and astigmatism, there were no statistically significant differences among the various instruments. With respect to the Km, statistically significant differences existed when the Pentacam AXL was compared with the IOLMaster 500 (WMD = −0.235 D) and the Lenstar (WMD = −0.233 D). When considering the ACD, statistically significant differences existed when contact ultrasound was compared with the IOLMaster (WMD = −0.133 mm), the IOLMaster 700 (WMD = −0.13 mm), and the OA-1000 (WMD = −0.47 mm). In addition, there were statistically significant differences between the IOLMaster 700 and the following devices (WMD from large to small): Argos (WMD = −0.113 mm), ANTERION (WMD = −0.07 mm), and Lenstar (WMD = −0.019 mm). We also found that the Lenstar obtained higher CCT measurements when compared to the OA-2000 (WMD = 13.683 μm) and the Pentacam AXL (WMD = 9.071 μm). There was also a statistically significant difference between the OA-2000 and the Pentacam AXL (WMD = −8.42 μm). As for the measurement of the CD, there were no significant differences among the devices except between the following pairs: the Lenstar and the IOLMaster 700, the IOLMaster and the Lenstar, the IOLMaster 500 and the OA-2000, the Galilei G6 and the IOLMaster 700, the IOLMaster 500 and the Pentacam AXL, the IOLMaster 500 and the IOLMaster 700, and the IOLMaster 700 and the ANTERION. Fig. S1 and Additional file 1: Appendix I Table S15).

Combination of direct and indirect comparisons
The results of the keratometry findings from the network meta-analyses are shown in Fig. 4. With respect to Kf, there was no statistically significant difference among the optical biometers (P > 0.05). The S2 and Additional file 1: Appendix I Tables S16-S18). Figure 5 shows the results for astigmatism. We found that there were no statistically significant differences between any of the studied instruments (P > 0.05) considering the AST, J0 and J45. As for the ranking results, the Lenstar obtained the maximum measured value of AST and J0 (70.9%, 65.4%, respectively), and obtained the minimum measured value of J45 (25%) (Additional file 2: Figure S3 and Additional file 1: Appendix I Tables S19-S21).
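The ranking percentages quoted here are SUCRA values. As a minimal sketch of how a SUCRA value follows from an instrument's ranking probabilities, the Python function below averages the cumulative rank probabilities over the first a − 1 ranks, where a is the number of instruments. The function name and the example probabilities are hypothetical; rank 1 is assumed to mean the largest measured value, in line with the definition given in the Statistical analysis section.

```python
def sucra(rank_probs):
    """Return SUCRA (in percent) for one instrument.

    rank_probs[k] is the probability that the instrument holds rank k + 1,
    with rank 1 assumed to be the largest measured value.
    """
    a = len(rank_probs)
    if a < 2:
        raise ValueError("SUCRA needs at least two competing instruments")
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities for ranks 1 .. a-1.
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return 100.0 * total / (a - 1)

# Made-up ranking probabilities for three instruments, for illustration only.
example = sucra([0.6, 0.3, 0.1])
```

A value near 100% means the instrument almost always yields the largest measurement in the network, and a value near 0% means it almost always yields the smallest.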
The results of ACD, AQD and CCT are shown in Fig. 6. When considering the ACD, statistically significant differences existed between contact ultrasound biometry and the following devices (WMD from large to small

Inconsistency
Node-splitting analysis between contact ultrasound biometry and the Lenstar for closed-loop comparisons in terms of AL showed significant inconsistency (P < 0.05). Similar results included: the Lenstar and the OA-2000 for Kf, the IOLMaster and the Lenstar for Kf and AST, the IOLMaster and contact ultrasound biometry for ACD, the Lenstar and contact ultrasound biometry for ACD, the IOLMaster and the OA-2000 for ACD, the IOLMaster 500 and the OA-2000 for ACD, the AL-Scan and the Lenstar for CD, the Argos and the Lenstar for CD, the Argos and the IOLMaster 700 for CD. We also used the design-by-treatment interactions model and found that global inconsistency existed for Kf, ACD, CCT and CD (P = 0.0041, P < 0.001, P < 0.001, P < 0.001, respectively) (Additional file 1: Appendix I Tables S27-S38).
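The idea behind checking one comparison for inconsistency can be sketched with the simplest single-loop case: an indirect estimate for A vs. B is formed through a common comparator C and then compared with the direct estimate by a z-test. The Python code below is a simplified analogue (the Bucher adjusted indirect comparison), not the node-splitting model fitted in STATA; function names and numbers are hypothetical.

```python
import math

def indirect_estimate(d_ac, se_ac, d_bc, se_bc):
    """Indirect A-vs-B mean difference through common comparator C."""
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)  # variances add for independent estimates
    return d_ab, se_ab

def inconsistency_test(d_direct, se_direct, d_indirect, se_indirect):
    """Compare direct and indirect estimates of the same contrast.

    Returns (difference, z, two-sided p-value under a normal approximation).
    """
    diff = d_direct - d_indirect
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))      # equals 2 * (1 - Phi(|z|))
    return diff, z, p

# Made-up values in mm, for illustration only.
d_ind, se_ind = indirect_estimate(-0.10, 0.03, 0.05, 0.04)
diff, z, p = inconsistency_test(-0.16, 0.05, d_ind, se_ind)
```

A p-value below 0.05 for a given pair would correspond to the kind of significant inconsistency reported above for that comparison.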

Subgroup analysis
The subgroup analysis also found no global inconsistency for AL, Ks, Km, AST, J0, J45, AQD and LT, and did not significantly change the results of the original network meta-analysis. There were 14 trials involving 9 instruments in the subgroup for the Kf measurement in cataract subjects. This analysis produced no significant inconsistency in the results. Statistically significant differences existed between the OA-2000 and the Pentacam AXL (WMD = 0.4 D) and between the OA-2000 and the Lenstar (WMD = 0.28 D) (full process and data shown in Additional file 3: Appendix II Tables S1-S18 and Additional file 3: Appendix II Tables S22-S39). For ACD, the subgroup of healthy subjects showed no significant inconsistency in the results. Statistically significant differences only existed between the OA-2000 and the IOLMaster 500 (WMD = 0.07 mm); the Lenstar and the IOLMaster (WMD = 0.08 mm); the Pentacam AXL and contact ultrasound biometry (WMD = 0.13 mm); the Argos and contact ultrasound biometry (WMD = 0.18 mm); the OA-2000 and contact ultrasound biometry (WMD = 0.10 mm); the Lenstar and contact ultrasound biometry (WMD = 0.11 mm); the IOLMaster and contact ultrasound biometry (WMD = 0.06 mm). For CCT and CD, the subgroups of healthy subjects likewise showed no significant inconsistency in the results. When considering the measurement of the CCT, there was no statistically significant difference between the Argos and the IOLMaster 700, which differs from the network meta-analysis. As for the measurement of the CD, statistically significant differences only existed when the OA-2000 was compared to the Lenstar and the IOLMaster.
Since there were global inconsistencies noted for Kf, ACD and CCT, we further compared groups defined by the principle of measurement. With respect to Kf, there was no statistically significant difference among the different measurement principles (P > 0.05). The measurement principles were ranked from maximum to minimum Kf values according to the SUCRA values: automated keratometer (AL-Scan, IOLMaster, IOLMaster 500, IOLMaster 700, Lenstar), Placido (Galilei G6, OA-2000, Aladdin), Scheimpflug (Pentacam AXL). The results were consistent with the results of the original network meta-analysis. When considering the measurement of the CCT, there was also no statistically significant difference among the different principles (P > 0.05). The measurement principles were ranked from maximum to minimum CCT values according to the SUCRA values: A-Scan ultrasound (contact ultrasound), Scheimpflug (Galilei G6, AL-Scan, Pentacam AXL), OLCR (Lenstar), SS-OCT (IOLMaster 700, ANTERION, Argos, OA-2000), OLCI (Aladdin). The results are essentially in agreement with the results of the original network meta-analysis. For the ACD, there were statistically significant differences between the A-Scan ultrasound and the following principles: PCI, OLCR, OLCI, SS-OCT, Scheimpflug. Statistically significant differences also existed between the SS-OCT and the PCI. These results were consistent with the results from the original network meta-analysis (Additional file 3: Appendix II Tables S19-S21 and Additional file 3: Appendix II Tables S40-S42).

Publication bias
Comparison-adjusted funnel plots for each parameter are provided in Additional file 1: Appendix I Figs. S1-S12. Most of these plots except ACD showed that the included studies lie symmetrically around the "0" line (vertical line). However, the significant publication bias in the ACD did not show up when we performed subgroup analysis for the ACD measurement in healthy subjects (Additional file 3: Appendix II Figs. S1-S18).

Discussion
This is the first network-based big data meta-analysis that comprehensively compares the instruments and techniques used for ophthalmic biometry. We performed an in-depth statistical comparison of 12 optical instruments and two ultrasound biometry methods by combining the data from 129 studies involving 17,181 eyes. The network meta-analysis demonstrated that when considering the measurement of AL and ACD, contact ultrasound biometry obtained lower values compared to all optical biometers. When considering the measurement of LT, contact ultrasound biometry obtained larger values compared to Galilei G6, IOLMaster 700 and Lenstar. No statistically significant differences existed among the four SS-OCT based devices (IOLMaster 700, OA-2000, Argos and ANTERION). In addition, the Pentacam AXL achieved the lowest values of keratometry and CD. As for the AST, J0 and J45, there were no statistically significant differences among the instruments included in this study. We also found that the lowest value of CCT measurement was given by the OA-2000, compared with the following instruments: IOLMaster 700, Lenstar, Pentacam AXL, AL-Scan and Galilei G6.
Many studies found that A-scan contact ultrasound biometry measured smaller AL and ACD and larger LT values compared to optical biometers, which is consistent with our conclusion [26][27][28][29][30]. The discrepancy for AL and ACD occurs because with contact ultrasound biometry the probe is likely to compress the cornea; with regard to LT, the difference may depend on the index of refraction used by optical biometers to convert the optical path length into a geometrical distance [28].
SS-OCT has some advantages over other optical technologies used for optical biometry, such as long-range OCT imaging or deeper light penetration [31]. Montes-Mico et al. [32] summarized the outcomes reported among four SS-OCT based devices (IOLMaster 700, OA-2000, Argos and ANTERION), and found that the mean differences in AL, ACD and LT measurements for repeatability and reproducibility among the four devices were close to zero. Moreover, many studies reported that agreement between these devices was good. Our results are in line with these previous findings.
Our study also found that the minimum values of Km, Kf and Ks were all given by the Pentacam AXL. It is worth mentioning that the mean K value was a little flatter when measured by the Pentacam AXL compared to the Lenstar [33]. Muzyka-Woźniak et al. [34] also reported that flatter K values were obtained with the Pentacam AXL in comparison to the IOLMaster 500. The Pentacam AXL measures K values at 138,000 reference points orientated in circles at approximately 3.0 mm optical zones on the cornea, which differs from the other devices; it is the only instrument that does not rely on corneal reflection [35]. As for J0, J45 and AST, the network meta-analysis results showed no statistically significant differences among the following devices: OA-2000, IOLMaster, Argos, IOLMaster 500, Aladdin, Pentacam AXL, AL-Scan, IOLMaster 700 and Lenstar.
In this study, the lowest value of CCT measurement was given by the OA-2000 when compared with the following instruments: IOLMaster 700, Lenstar, Pentacam AXL, AL-Scan and Galilei G6. The maximum value of CCT measurement was obtained by the Galilei G6 (according to the SUCRA). The difference may be explained by the differences in algorithms and analysis programs of the two devices in boundary determination. The Galilei G6 CCT uses the Scheimpflug principle and measures CCT from the air-tear film surfaces to the posterior corneal surface. The OA-2000 uses a 1060 nm swept source laser to measure the CCT from the anterior corneal surface to the posterior corneal surface. Since the former technology can measure beyond the anterior surface, corneal thickness and posterior corneal curvature can be evaluated with high precision [36].
Here, the Aladdin and Pentacam AXL gave lower values of CD compared to the IOLMaster 700, IOLMaster 500, Lenstar and IOLMaster. There was no statistically significant difference between the Aladdin and the Pentacam AXL. In addition, according to the SUCRA, the IOLMaster was most likely to obtain the maximum CD value. Sabatino et al. [37] described that the IOLMaster produced a greater mean value for CD than the Aladdin. Huang et al. [15] and Cruysberg et al. [38] arrived at the same conclusion. Further, Yeu et al. [39] found that the CD distance showed statistically significant differences (− 0.4 mm on average) between the Aladdin and the Lenstar. This may be attributed to Aladdin's use of corneal topography whereas the IOLMaster uses photographic techniques to determine the CD [18,40]. Based on these results, the CD measurements with the Aladdin and the IOLMaster could not be used interchangeably.
In relation to Kf, ACD, CCT and CD measurements, our results indicate that there is too much heterogeneity to draw reliable conclusions. Differences in population and AL may influence the measurements [38,41]. Therefore, we performed subgroup analyses in two population groups: healthy vs. diseased (cataract), and limited the subgroups to adults and the normal AL range (22 to 26 mm). The subgroup analysis for the Kf measurement in cataract subjects found no significant inconsistency. However, there was a marked inconsistency amongst the healthy subjects, which may be due to the small number of studies that have directly compared these devices in healthy subjects (such as the OA-2000 vs. the IOLMaster). Regarding ACD, CCT and CD, we also conducted subgroup analyses and found no inconsistency in healthy subjects but a marked inconsistency amongst the cataract subjects. This may be due to the different wavelengths of the light sources used by the various devices, which make the results sensitive to the degree of turbidity of the refractive medium. The various types and degrees of cataractous lens opacification (cortical, nuclear or posterior subcapsular) may be the cause of inconsistency in cataract subjects [42]. However, since the included articles lack sufficient data for this type of subgroup analysis, future studies should pay more attention to this aspect.
Our study also had other limitations. There were some differences in the characteristics of the included studies (such as the racial diversity of studied populations, sample size, quality of study methods employed, operator competency, the time interval between equipment measurements and publication bias) that may influence both heterogeneity in direct comparisons and transitivity in indirect comparisons in subgroup analyses. To explore the possible impact of these factors on the results, more high-quality studies with concordant features are needed to enhance the statistical power and quality of evidence in the future. Since new ophthalmic technologies are invented continuously, we have not included all the available instruments in clinical practice, but only focused on the anterior segment and AL biometry.

Conclusion
This network-based big data analysis demonstrated that when considering the measurement of AL and ACD, contact ultrasound biometry obtains lower values compared with optical biometers. For LT, contact ultrasound biometry obtains larger values compared with Galilei G6, IOLMaster 700 and Lenstar. The Pentacam AXL was also shown to achieve the lowest values with respect to keratometry and CD. Additionally, it was demonstrated that the lowest value of CCT measurement was given by the OA-2000, compared with the following instruments: IOLMaster 700, Lenstar, Pentacam AXL, AL-Scan, and Galilei G6.